# Speech-to-Text and Text-to-Speech Conversion
Ultravox v0.4 Llama 3.1 70B
MIT
Ultravox is a multimodal speech large language model built on the pre-trained Llama3.1-70B-Instruct and Whisper-medium backbones. It accepts both speech and text as input.
Audio-to-Text
Transformers, Supports Multiple Languages

fixie-ai
79
4
Hf Seamless M4t Large
SeamlessM4T is a unified model supporting multilingual speech and text translation, capable of performing speech-to-speech, speech-to-text, text-to-speech, and text-to-text translation tasks.
Text-to-Audio
Transformers

facebook
4,648
57
Hf Seamless M4t Medium
SeamlessM4T is a multilingual translation model that supports both speech and text input/output, enabling cross-language communication.
Text-to-Audio
Transformers

facebook
14.74k
30
Featured Recommended AI Models